 network weight



A Implementation details

Neural Information Processing Systems

The diagram of our proposed Neural Lad framework is illustrated in Fig. 1. The pseudocode of the proposed Neural Lad is described in Alg. 1. The training time of Neural Lad for the toy dataset is about 8s per epoch. It is worth noting that we use a larger weight decay for the PhysioNet sepsis dataset to avoid over-fitting. Visualization of memory network enhanced scores.



0d5a4a5a748611231b945d28436b8ece-Paper.pdf

Neural Information Processing Systems

Neural network models are known to reinforce hidden data biases, making them unreliable and difficult to interpret. We seek to build models that 'know what they do not know' by introducing inductive biases in the function space.


067437c6d5d0369b6d09200bef89715b-Paper-Conference.pdf

Neural Information Processing Systems

In this paper, we propose HyperLogic: a flexible approach leveraging hypernetworks to generate weights of the main network. HyperLogic can be combined with existing differentiable rule learning methods to generate diverse rule sets, each capable of capturing heterogeneous patterns in data.
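The general hypernetwork idea mentioned in this snippet can be sketched as follows. This is a minimal NumPy illustration, not HyperLogic itself: the function names `hypernetwork` and `main_network`, the single-layer architecture, and all dimensions are assumptions made for the example. A latent code is mapped to a flat parameter vector, which is then reshaped into the weights of a small main network.

```python
import numpy as np

def hypernetwork(z, W_h, b_h):
    """Map a latent code z to a flat vector of main-network parameters (hypothetical design)."""
    return np.tanh(z @ W_h + b_h)

def main_network(x, flat_params, in_dim, out_dim):
    """Linear main network whose weight matrix and bias are produced by the hypernetwork."""
    W = flat_params[: in_dim * out_dim].reshape(in_dim, out_dim)
    b = flat_params[in_dim * out_dim :]
    return x @ W + b

rng = np.random.default_rng(0)
latent_dim, in_dim, out_dim = 4, 3, 2          # illustrative sizes only
n_params = in_dim * out_dim + out_dim          # weights plus bias of the main network
W_h = rng.normal(size=(latent_dim, n_params))  # hypernetwork parameters
b_h = np.zeros(n_params)

x = rng.normal(size=(5, in_dim))
# Two different latent codes yield two different weight sets for the same main network.
outputs = [
    main_network(x, hypernetwork(rng.normal(size=latent_dim), W_h, b_h), in_dim, out_dim)
    for _ in range(2)
]
```

Sampling several latent codes is one simple way to obtain several distinct main networks from a single set of hypernetwork parameters.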


Appendix

Neural Information Processing Systems

The form of Equation (A.8) allows us to apply the chain rule to calculate the gradient of the normalized ... Again, the chain rule is applied for the derivative of the weight matrix. Based on the gradient, one step of optimization under learning rate α could be expressed in a neat matrix multiplication format, decomposed by orthonormal bases U = {u1, u2, ...} and V = {v1, v2, ...}. The whole pruning framework is detailed in Algorithm 1. Grow fraction α is a function of training iterations that gradually decays for stability of training. ImageNet experiments are run on 8 NVIDIA Tesla V100s. Accordingly, the schedule of AC/DC needs slight modifications based on the original setting.


Weight Space Representation Learning with Neural Fields

arXiv.org Artificial Intelligence

In this work, we investigate the potential of weights to serve as effective representations, focusing on neural fields. Our key insight is that constraining the optimization space through a pre-trained base model and low-rank adaptation (LoRA) can induce structure in weight space. Across reconstruction, generation, and analysis tasks on 2D and 3D data, we find that multiplicative LoRA weights achieve high representation quality while exhibiting distinctiveness and semantic structure. When used with latent diffusion models, multiplicative LoRA weights enable higher-quality generation than existing weight-space methods.
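To make the contrast between additive and multiplicative low-rank adaptation concrete, here is a small NumPy sketch. The additive form `W0 + B @ A` is standard LoRA. The multiplicative form shown here, an element-wise product `W0 * (B @ A)`, is an assumption about what "multiplicative LoRA" means and may differ from the paper's exact definition. All names and sizes are illustrative.

```python
import numpy as np

def additive_lora(W0, B, A):
    """Standard LoRA: add a low-rank update to the frozen base weights."""
    return W0 + B @ A

def multiplicative_lora(W0, B, A):
    """Assumed multiplicative variant: scale base weights element-wise by a low-rank matrix."""
    return W0 * (B @ A)

rng = np.random.default_rng(0)
out_dim, in_dim, rank = 6, 4, 2              # illustrative sizes only
W0 = rng.normal(size=(out_dim, in_dim))      # frozen pre-trained base weights
B = rng.normal(size=(out_dim, rank))         # trainable low-rank factor
A = rng.normal(size=(rank, in_dim))          # trainable low-rank factor

W_add = additive_lora(W0, B, A)
W_mul = multiplicative_lora(W0, B, A)

# Only B and A would be stored per data sample, instead of the full out_dim * in_dim matrix.
n_lora_params = B.size + A.size
```

In both variants the per-sample representation is the pair `(B, A)`, which is what would be treated as a point in weight space under this reading.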



Reviewer

Neural Information Processing Systems

We thank all the reviewers for the unanimous positive comments! Below we address questions raised by each reviewer. Q: "The operator on edges is never defined." It means stitching two adjacent paths (i.e., they share one endpoint) into a longer path. Q: "Is this really an okay assumption? Is anything given up by this assumption? Most parametric maps will not be injective." This is to define the cycle-consistency basis. Q: "It would be nice to show this form of cycle-consistency optimization enables novel capabilities, rather than ..." Improved testing accuracy indicates better-learned representations.


90fd4f88f588ae64038134f1eeaa023f-AuthorFeedback.pdf

Neural Information Processing Systems

Thank you for all the helpful comments. Several related works were raised by the reviewers which we discuss here. We note that the authors have marked their ArXiv submission as containing errors. Each of their inner loops uses SGD to solve the distance-regularized objectives. First, we use the EMA of slow weights to adjust the training parameters during optimization.
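For readers unfamiliar with the term, an exponential moving average (EMA) of weights can be sketched as below. This is a generic illustration of the EMA update rule only. The function name `ema_update`, the decay value, and the toy weights are assumptions, and the snippet does not reproduce how the authors use the EMA to adjust their training parameters.

```python
import numpy as np

def ema_update(ema, weights, decay=0.9):
    """One EMA step: keep a fraction `decay` of the old average and blend in the new weights."""
    return decay * ema + (1.0 - decay) * weights

slow = np.array([1.0, -2.0])   # stand-in for the current slow weights
ema = np.zeros_like(slow)      # EMA initialised at zero

# With constant weights, the EMA converges geometrically toward them.
for _ in range(200):
    ema = ema_update(ema, slow)
```

A decay closer to 1 gives a smoother but slower-moving average.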